Visualizing Statistical Mix Effects and Simpson's Paradox

Authors

  • Zan Armstrong
  • Martin Wattenberg
Abstract

We discuss how "mix effects" can surprise users of visualizations and potentially lead them to incorrect conclusions. This statistical issue (also known as "omitted variable bias" or, in extreme cases, as "Simpson's paradox") is widespread and can affect any visualization in which the quantity of interest is an aggregated value such as a weighted sum or average. Our first contribution is to document how mix effects can be a serious issue for visualizations, and we analyze how mix effects can cause problems in a variety of popular visualization techniques, from bar charts to treemaps. Our second contribution is a new technique, the "comet chart," that is meant to ameliorate some of these issues.
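As a minimal sketch of how a mix effect can arise in a weighted average, consider the example below. The segment names, group sizes, and values are invented purely for illustration and do not come from the paper; the helper names `weighted_average` and `summarize` are likewise hypothetical. Every segment's average falls, yet the aggregate rises because the mix shifts toward the higher-valued segment.

```python
# Hypothetical numbers, invented for illustration only (not from the paper).
# Each segment maps to (group_size, average_value) for one time period.
before = {"A": (900, 10.0), "B": (100, 20.0)}
after = {"A": (500, 9.0), "B": (500, 19.0)}


def weighted_average(segments):
    """Aggregate value: segment averages weighted by group size."""
    total_weight = sum(weight for weight, _ in segments.values())
    return sum(weight * value for weight, value in segments.values()) / total_weight


def summarize(before, after):
    """Return the per-segment changes and the change in the aggregate."""
    per_segment = {name: after[name][1] - before[name][1] for name in before}
    overall = weighted_average(after) - weighted_average(before)
    return per_segment, overall


per_segment_change, overall_change = summarize(before, after)
print(per_segment_change)  # both segments decline by 1.0
print(overall_change)      # the aggregate nevertheless rises by 3.0
```

A chart that plots only the aggregate would show an increase here, while a chart split by segment would show two declines. That disagreement is the kind of surprise the abstract describes.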

Similar resources

Simpson's Paradox and Cornfield’s Conditions

Simpson's Paradox occurs when an observed association is spurious – reversed after taking into account a confounding factor. At best, Simpson's Paradox is used to argue that association is not causation. At worst, Simpson's Paradox is used to argue that induction is impossible in observational studies (that all arguments from association to causation are equally suspect) since any association c...

Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon – the reversal paradox

This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. This article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon--the reversal paradox-...

Revisiting Simpson’s Paradox: statistically warranted vs. unwarranted inference results

The primary objective of this paper is to revisit Simpson’s paradox using a statistical misspecification perspective. It is argued that the reversal of statistical associations is sometimes spurious, stemming from invalid probabilistic assumptions imposed on the data. The concept of statistical misspecification is used to formalize the vague term ‘spurious results’ as ‘statistically untrustwort...

The Inverse Simpson Paradox (How to win without overtly cheating)

Given two sets of data which lead to a similar statistical conclusion, the Simpson Paradox [10] describes the tactic of combining these two sets and achieving the opposite conclusion. Depending upon the given data, this may or may not succeed. Inverse Simpson is a method of decomposing a given set of comparison data into two disjoint sets and achieving the opposite conclusion for each one. This...

How Likely is Simpson's Paradox in Path Models?

Simpson’s paradox is a phenomenon arising from multivariate statistical analyses that often leads to paradoxical conclusions, in the field of e-collaboration as well as in many other fields where multivariate methods are employed. We derive a general inequality for the occurrence of Simpson’s paradox in path models with or without latent variables. The inequality is then used to estimate the proba...

Journal:
  • IEEE Transactions on Visualization and Computer Graphics

Volume 20, Issue 12

Pages: -

Publication date: 2014